Presentation Video Retrieval using Automatically Recovered Slide and Spoken Text
Abstract
Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the visual and aural channels: the presentation slides and lecturer’s speech. This paper examines the relative utility of automatically recovered text from these sources for lecture video retrieval. To extract the visual information, we apply video content analysis to detect slides and optical character recognition to obtain their text. Automatic speech recognition is used similarly to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher precision video retrieval than automatically recovered spoken text.
Similar resources
Multimedia surrogates for video gisting: Toward combining spoken words and imagery
Good surrogates that allow people to quickly derive the gist of videos without taking the time to view the full video are crucial to video retrieval and browsing systems. Although there are many kinds of textual and visual surrogates used in video retrieval systems, there are few audio surrogates in practice. To evaluate the effectiveness of audio surrogates alone and in combination with one ki...
A Korean Spoken Document Retrieval System for Lecture Search
In this paper, we introduced a Korean spoken document retrieval system for lecture search. We automatically build a general inverted index table from spoken document transcriptions, and we extract additional information from textbooks or slide notes related to the lecture. We integrate these two sources for a search process. The speech corpus used in our system is from a high school mathematics ...
Linking Presentation Documents Using Image Analysis
Systems for recording presentations are becoming commonly available. Commercial solutions include authoring tools that let users create online representations by recording audio, video, and presentation slides while a talk is being given. A typical collection of presentation recordings may contain hundreds or even thousands of recordings, making it very difficult to retrieve particular presentatio...
Semiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges associated with each approach have led researchers to combine them and to adopt semi-automatic retrieval that uses user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provided two kind of qu...
Speech Recognition and Information Retrieval: Experiments in Retrieving Spoken Documents
The Informedia Digital Video Library Project at Carnegie Mellon University is making large corpora of video and audio data available for full content retrieval by integrating natural language understanding, image processing, speech recognition and information retrieval. Information retrieval from corpora of speech recognition output is critical to the project’s success. In this paper, we out...